Low Complexity Classification of Thermophilic Protein using One Hot Encoding as Protein Representation

Authors

Abstract

The laborious and cost-inefficient biochemical methods for identifying thermophilic proteins call for a rapid and accurate method for identifying these proteins. Recently, machine learning has become more effective for identifying specific classes of extremophiles. There is still a need for a low-cost method for identifying thermophilic proteins, despite the fact that studies employing machine learning have yielded superior results to conventional methods. Here, we avoid the problem of manually crafted features, which involves experts defining and extracting a set of features, by using only protein sequences as input to various computational methods. This study classifies thermophilic proteins and their counterparts using a one-hot encoding representation in a bidirectional long short-term memory (BiLSTM) model. The model achieved an accuracy of 92.34 percent, a specificity of 91 percent, and a sensitivity of 93.77 percent, whereas other models reported elsewhere rely on a number of features. In addition, trustworthy and objective data and independent evaluation make this model competitive with other models.
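As a rough illustration of the representation and the metrics named in the abstract, the sketch below one-hot encodes a protein sequence over the 20 standard amino acids and computes accuracy, sensitivity, and specificity from confusion-matrix counts. It is not the authors' code: the function names, the zero-padding to a fixed length, and the all-zero row for non-standard residues are assumptions made for this example, and the BiLSTM itself is not shown.

```python
# Illustrative sketch only; names and the padding scheme are assumptions,
# not taken from the paper.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot_encode(sequence, max_len):
    """Return a max_len x 20 matrix (list of lists) for one protein sequence.

    Residues outside the 20 standard letters and padding positions become
    all-zero rows; sequences longer than max_len are truncated.
    """
    matrix = []
    for residue in sequence.upper()[:max_len]:
        row = [0] * len(AMINO_ACIDS)
        idx = AA_INDEX.get(residue)
        if idx is not None:
            row[idx] = 1
        matrix.append(row)
    # Pad with zero rows so every encoded sequence has the same shape.
    while len(matrix) < max_len:
        matrix.append([0] * len(AMINO_ACIDS))
    return matrix


def binary_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity (true positive rate), specificity (true negative rate)."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return accuracy, sensitivity, specificity


# Hypothetical three-residue fragment, padded to length 5.
encoded = one_hot_encode("MKV", max_len=5)
```

Each row of `encoded` corresponds to one sequence position, so a sequence model such as a BiLSTM can consume the matrix one position at a time without any hand-crafted features.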


Similar resources

GENERATING FUZZY RULES FOR PROTEIN CLASSIFICATION

This paper considers the generation of some interpretable fuzzy rules for assigning an amino acid sequence into the appropriate protein superfamily. Since the main objective of this classifier is the interpretability of rules, we have used the distribution of amino acids in the sequences of proteins as features. These features are the occurrence probabilities of six exchange groups in the seque...


Determination of Protein Concentration Using Bradford Microplate Protein Quantification Assay

Background: Bradford protein assay is popular due to its ease of performance and relative sensitivity. Many researchers and laboratories in Iran use standard assay of Bradford by cuvette. No commercial kit was available for Bradford microplate assay in Iran. Meanwhile, imported Bradford commercial kits are very expensive and have a long delivery time in Iran. Till now no study or document on Br...


Prediction of Protein Sub-Mitochondria Locations Using Protein Interaction Networks

Background: Prediction of the protein localization is among the most important issues in the bioinformatics that is used for the prediction of the proteins in the cells and organelles such as mitochondria. In this study, several machine learning algorithms are applied for the prediction of the intracellular protein locations. These algorithms use the features extracted from pro...


Evaluation of fisheries wastes as protein hydrolyzate

In Turkey, it is known that wastes from fisheries processing plants are released to the environment and only a small part of them is used as fish meal or fish feed by a few plants. It is expected that evaluating these nutrient-rich wastes by converting them to various commercial products (chitin, protein hydrolyzate, carotenoprotein, pigment extraction, etc.) will be beneficial i...


Protein classification using ontology classification

MOTIVATION The classification of proteins expressed by an organism is an important step in understanding the molecular biology of that organism. Traditionally, this classification has been performed by human experts. Human knowledge can recognise the functional properties that are sufficient to place an individual gene product into a particular protein family group. Automation of this task usua...



Journal

Journal title: International Journal of Advanced Computer Science and Applications

Year: 2022

ISSN: 2158-107X, 2156-5570

DOI: https://doi.org/10.14569/ijacsa.2022.0131212